Results 1-20 of 29,944
1.
J Acoust Soc Am ; 155(4): R7-R8, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38558083

ABSTRACT

The Reflections series takes a look back at historical articles from The Journal of the Acoustical Society of America that have had a significant impact on the science and practice of acoustics.


Subjects
Speech Perception, Acoustics, Speech Acoustics, Cognition
2.
Article in Chinese | MEDLINE | ID: mdl-38561257

ABSTRACT

Objective: This study investigates the effects of signal-to-noise ratio (SNR), frequency, and bandwidth on horizontal sound localization accuracy in normal-hearing young adults. Methods: From August 2022 to December 2022, a total of 20 normal-hearing young adults (7 males and 13 females, aged 20 to 35 years, mean age 25.4 years) participated in horizontal azimuth recognition tests under both quiet and noisy conditions. Six narrowband filtered noise stimuli were used, with center frequencies (CF) of 250, 2000, and 4000 Hz and bandwidths of 1/6 and 1 octave. Continuous broadband white noise was used as the background masker, at SNRs of 0, -3, and -12 dB. The root-mean-square (RMS) error was used to measure sound localization accuracy, with smaller values indicating higher accuracy. The Friedman test was used to compare the effects of SNR and CF on sound localization accuracy, and the Wilcoxon signed-rank test was used to compare the impact of the two bandwidths on sound localization accuracy in noise. Results: In a quiet environment, the RMS error in horizontal azimuth ranged from 4.3 to 8.1 degrees. Sound localization accuracy decreased with decreasing SNR: at 0 dB SNR (range: 5.3-12.9 degrees), the difference from the quiet condition was not significant (P>0.05); at -3 dB (range: 7.3-16.8 degrees) and -12 dB SNR (range: 9.4-41.2 degrees), however, accuracy decreased significantly compared to the quiet condition (all P<0.01). Under noisy conditions, localization accuracy differed across frequencies and bandwidths: high-frequency stimuli were localized worst, middle frequencies intermediate, and low frequencies best (all P<0.01). Localization of 1/6-octave stimuli was more susceptible to noise interference than that of 1-octave stimuli (all P<0.01). Conclusions: The ability of normal-hearing young adults to localize sound in the horizontal plane in the presence of noise is influenced by SNR, CF, and bandwidth. Noise at SNRs of ≤-3 dB can reduce narrowband sound localization accuracy. Higher-CF signals and narrower bandwidths are more susceptible to noise interference.
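As a minimal illustration of the two analysis steps named above (RMS error as the localization accuracy metric, and a Wilcoxon signed-rank test for the bandwidth comparison), here is a Python sketch; all data values are invented for demonstration and are not the study's data.

```python
import numpy as np
from scipy.stats import wilcoxon

def rms_error(presented_deg, responded_deg):
    """Root-mean-square error between presented and judged azimuths (degrees)."""
    presented = np.asarray(presented_deg, dtype=float)
    responded = np.asarray(responded_deg, dtype=float)
    return np.sqrt(np.mean((responded - presented) ** 2))

print(f"{rms_error([0, 30, -30], [5, 24, -41]):.1f} deg")  # ~7.8 deg

# Hypothetical per-participant RMS errors for the two bandwidths at one SNR
rms_narrow = np.array([12.1, 15.3, 9.8, 14.0, 11.2, 13.6])  # 1/6-octave stimuli
rms_wide = np.array([8.4, 10.2, 7.9, 9.5, 8.8, 9.9])        # 1-octave stimuli
stat, p = wilcoxon(rms_narrow, rms_wide)  # paired, non-parametric comparison
print(f"Wilcoxon signed-rank: W={stat}, p={p:.3f}")
```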


Subjects
Sound Localization, Speech Perception, Male, Female, Humans, Young Adult, Adult, Noise, Signal-To-Noise Ratio, Hearing
3.
Sci Rep ; 14(1): 8181, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589483

ABSTRACT

Temporal envelope modulations (TEMs) are one of the most important features that cochlear implant (CI) users rely on to understand speech. Electroencephalographic assessment of TEM encoding could help clinicians to predict speech recognition more objectively, even in patients unable to provide active feedback. The acoustic change complex (ACC) and the auditory steady-state response (ASSR) evoked by low-frequency amplitude-modulated pulse trains can be used to assess TEM encoding with electrical stimulation of individual CI electrodes. In this study, we focused on amplitude modulation detection (AMD) and amplitude modulation frequency discrimination (AMFD) with stimulation of a basal versus an apical electrode. In twelve adult CI users, we (a) assessed behavioral AMFD thresholds and (b) recorded cortical auditory evoked potentials (CAEPs), AMD-ACC, AMFD-ACC, and ASSR in a combined 3-stimulus paradigm. We found that the electrophysiological responses were significantly higher for apical than for basal stimulation. Peak amplitudes of the AMFD-ACC were small and therefore did not correlate with speech-in-noise recognition. We found significant correlations between speech-in-noise recognition and (a) behavioral AMFD thresholds and (b) AMD-ACC peak amplitudes. AMD and AMFD hold potential for the development of a clinically applicable tool for assessing TEM encoding to predict speech recognition in CI users.


Subjects
Cochlear Implantation, Cochlear Implants, Speech Perception, Adult, Humans, Psychoacoustics, Speech Perception/physiology, Speech, Acoustic Stimulation, Evoked Potentials, Auditory/physiology
4.
BMC Neurol ; 24(1): 115, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589815

ABSTRACT

BACKGROUND: Although cochlear implants can restore auditory inputs to deafferented auditory cortices, the quality of the sound signal transmitted to the brain is severely degraded, limiting functional outcomes in terms of speech perception and emotion perception. The latter deficit negatively impacts cochlear implant users' social integration and quality of life; however, emotion perception is not currently part of rehabilitation. Developing rehabilitation programs incorporating emotional cognition requires a deeper understanding of cochlear implant users' residual emotion perception abilities. METHODS: To identify the neural underpinnings of these residual abilities, we investigated whether machine learning techniques could be used to identify emotion-specific patterns of neural activity in cochlear implant users. Using existing electroencephalography data from 22 cochlear implant users, we employed a random forest classifier to establish whether we could model and subsequently predict, from participants' brain responses, the auditory emotions (vocal and musical) presented to them. RESULTS: Our findings suggest that consistent emotion-specific biomarkers exist in cochlear implant users, which could be used to develop effective rehabilitation programs incorporating emotion perception training. CONCLUSIONS: This study highlights the potential of machine learning techniques to improve outcomes for cochlear implant users, particularly in terms of emotion perception.
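As a hedged sketch of the decoding approach described (a random forest classifier predicting stimulus emotion from EEG responses), the following Python snippet cross-validates a classifier on per-trial feature vectors. The feature dimensions, trial counts, and labels are invented placeholders, not the study's data or pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical feature matrix: one row per trial, columns are EEG features
# (e.g., per-channel band power); labels are the emotion of the stimulus.
X = rng.normal(size=(200, 64))
y = rng.integers(0, 3, size=200)  # e.g., 0 = neutral, 1 = happy, 2 = sad

clf = RandomForestClassifier(n_estimators=500, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)  # decoding accuracy per fold
print(f"Mean cross-validated accuracy: {scores.mean():.2f}")  # chance is ~0.33
```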


Subjects
Cochlear Implants, Speech Perception, Humans, Quality of Life, Emotions, Electroencephalography
5.
Laryngorhinootologie ; 103(4): 252-260, 2024 Apr.
Article in German | MEDLINE | ID: mdl-38565108

ABSTRACT

Language processing can be measured objectively using late components of the evoked brain potential. The most established component in this area of research is the N400, a negativity that peaks at about 400 ms after stimulus onset with a centro-parietal maximum. It reflects semantic processing. Its presence, as well as its temporal and quantitative expression, allows conclusions to be drawn about the quality of processing. It is therefore suitable for measuring speech comprehension in special populations, such as cochlear implant (CI) users. The following is an overview of the use of the N400 component as a tool for studying language processes in CI users. We present studies with adult CI users, in which the N400 reflects the quality of speech comprehension with the new hearing device, and studies with children, in which the emergence of the N400 component reflects the acquisition of their very first vocabulary.


Subjects
Cochlear Implants, Speech Perception, Adult, Child, Female, Humans, Male, Comprehension/physiology, Electroencephalography, Evoked Potentials/physiology, Language, Language Development, Semantics, Speech Perception/physiology
6.
J Acoust Soc Am ; 155(4): 2698-2706, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38639561

ABSTRACT

The notion of the "perceptual center" or "P-center" has been put forward to account for the repeated finding that acoustic and perceived syllable onsets do not necessarily coincide, at least in the perception of simple monosyllables or disyllables. The magnitude of the discrepancy between acoustics and perception (the location of the P-center in the speech signal) has proven difficult to estimate, though acoustic models of the effect do exist. The present study asks whether the P-center effect can be documented in natural connected speech of English and Japanese and examines whether an acoustic model that defines the P-center as the moment of the fastest energy change in a syllabic amplitude envelope adequately reflects the P-center in the two languages. A sensorimotor synchronization paradigm was deployed to address these research questions. The results provide evidence for the existence of the P-center effect in the speech of both languages, while the acoustic P-center model is found to be less applicable to Japanese. Sensorimotor synchronization patterns further suggest that the P-center may reflect perceptual anticipation of a vowel onset.
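The acoustic model named above (the P-center as the moment of the fastest energy change in the syllabic amplitude envelope) can be sketched in a few lines of Python. The Hilbert-envelope extraction and the 10 Hz smoothing cutoff are illustrative assumptions, not the study's exact implementation.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def p_center(signal, fs, cutoff_hz=10.0):
    """Estimate the P-center as the time (s) of fastest rise in the envelope."""
    envelope = np.abs(hilbert(signal))               # amplitude envelope
    sos = butter(2, cutoff_hz, fs=fs, output="sos")  # smooth to syllable rate
    smooth = sosfiltfilt(sos, envelope)
    velocity = np.diff(smooth) * fs                  # rate of energy change
    return np.argmax(velocity) / fs

# Toy example: a 200 Hz tone whose amplitude ramps up between 50 and 100 ms
fs = 16000
t = np.arange(int(0.3 * fs)) / fs
burst = np.sin(2 * np.pi * 200 * t) * np.clip((t - 0.05) / 0.05, 0, 1)
print(f"Estimated P-center: {p_center(burst, fs) * 1000:.0f} ms")
```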


Subjects
Speech Acoustics, Speech Perception, Humans, Phonetics, Speech, Language
7.
BMJ ; 385: q916, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641356
8.
PLoS One ; 19(4): e0301514, 2024.
Article in English | MEDLINE | ID: mdl-38564597

ABSTRACT

Evoked potential studies have shown that speech planning modulates auditory cortical responses. The phenomenon's functional relevance is unknown. We tested whether, during this time window of cortical auditory modulation, there is an effect on speakers' perceptual sensitivity for vowel formant discrimination. Participants made same/different judgments for pairs of stimuli consisting of a pre-recorded, self-produced vowel and a formant-shifted version of the same production. Stimuli were presented prior to a "go" signal for speaking, prior to passive listening, and during silent reading. The formant discrimination stimulus /uh/ was tested with a congruent productions list (words with /uh/) and an incongruent productions list (words without /uh/). Logistic curves were fitted to participants' responses, and the just-noticeable difference (JND) served as a measure of discrimination sensitivity. We found a statistically significant effect of condition (worst discrimination before speaking) with no congruency effect. Post-hoc pairwise comparisons revealed that the JND was significantly greater before speaking than during silent reading. Thus, formant discrimination sensitivity was reduced during speech planning, regardless of the congruence between the discrimination stimulus and the predicted acoustic consequences of the planned speech movements. This finding may inform ongoing efforts to determine the functional relevance of the previously reported modulation of auditory processing during speech planning.
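A minimal sketch of the psychometric analysis described (fitting a logistic curve to same/different responses and reading off a JND): the shift units, response proportions, and the 50% criterion below are assumptions for illustration, not the study's data.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """P('different' response) as a function of formant shift magnitude."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

# Hypothetical data: formant shifts and proportions of 'different' responses
shifts = np.array([0, 10, 20, 30, 40, 60, 80], dtype=float)
p_diff = np.array([0.05, 0.10, 0.30, 0.55, 0.75, 0.92, 0.98])

(x0, k), _ = curve_fit(logistic, shifts, p_diff, p0=[30.0, 0.1])
jnd = x0  # shift at which P('different') = 0.5; criterion is a simplification
print(f"JND estimate: {jnd:.1f} shift units")
```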


Subjects
Auditory Cortex, Speech Perception, Humans, Speech/physiology, Speech Perception/physiology, Acoustics, Movement, Phonetics, Speech Acoustics
9.
Cereb Cortex ; 34(4), 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38566511

ABSTRACT

This study investigates neural processes in infant speech processing, with a focus on left frontal brain regions and hemispheric lateralization in Mandarin-speaking infants' acquisition of native tonal categories. We tested 2- to 6-month-old Mandarin learners to explore age-related improvements in tone discrimination, the role of inferior frontal regions in abstract speech category representation, and left-hemisphere lateralization during tone processing. Using a block design, we presented the four Mandarin tones on the syllable [ta] and measured oxygenated hemoglobin concentration with functional near-infrared spectroscopy. Results showed age-related improvements in tone discrimination, greater involvement of frontal regions in older infants (indicating the development of abstract tonal representations), and increased bilateral activation mirroring that of native adult Mandarin speakers. These findings contribute to our broader understanding of the relationship between native speech acquisition and infant brain development during the critical period of early language learning.


Subjects
Speech Perception, Speech, Adult, Infant, Humans, Aged, Speech Perception/physiology, Pitch Perception/physiology, Language Development, Brain/diagnostic imaging, Brain/physiology
10.
Cereb Cortex ; 34(4), 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38566510

ABSTRACT

Statistical learning (SL) is the ability to detect and learn regularities from input and is foundational to language acquisition. Despite the dominant role of SL as a theoretical construct for language development, there is a lack of direct evidence supporting shared neural substrates underlying language processing and SL. It is also not clear whether the similarities, if any, are related to linguistic processing or to statistical regularities in general. The current study tests whether the brain regions involved in natural language processing are similarly recruited during auditory linguistic SL. Twenty-two adults performed an auditory linguistic SL task, an auditory nonlinguistic SL task, and a passive story-listening task while their neural activation was monitored. Within the language network, the left posterior temporal gyrus showed sensitivity to embedded speech regularities during auditory linguistic SL, but not during auditory nonlinguistic SL. Using a multivoxel pattern similarity analysis, we uncovered similarities between the neural representations of auditory linguistic SL and of language processing within the left posterior temporal gyrus. No other brain region showed similarities between linguistic SL and language comprehension, suggesting that a shared neurocomputational process for auditory SL and natural language processing within the left posterior temporal gyrus is specific to linguistic stimuli.


Subjects
Learning, Speech Perception, Adult, Humans, Language, Linguistics, Language Development, Brain, Speech Perception/physiology, Brain Mapping, Magnetic Resonance Imaging
11.
IEEE J Transl Eng Health Med ; 12: 382-389, 2024.
Article in English | MEDLINE | ID: mdl-38606392

ABSTRACT

Acoustic features extracted from speech can help with the diagnosis of neurological diseases and the monitoring of symptoms over time. Temporal segmentation of audio signals into individual words is an important pre-processing step prior to extracting acoustic features. Machine learning techniques can be used to automate speech segmentation via automatic speech recognition (ASR) and sequence-to-sequence alignment. While state-of-the-art ASR models achieve good performance on healthy speech, their performance significantly drops when evaluated on dysarthric speech. Fine-tuning ASR models on impaired speech can improve performance for dysarthric individuals, but it requires representative clinical data, which are difficult to collect and may raise privacy concerns. This study explores the feasibility of using two augmentation methods to increase ASR performance on dysarthric speech: 1) healthy individuals varying their speaking rate and loudness (as is often done in assessments of pathological speech); 2) synthetic speech with variations in speaking rate and accent (to ensure more diverse vocal representations and fairness). Experimental evaluations showed that fine-tuning a pre-trained ASR model with data from these two sources outperformed a model fine-tuned only on real clinical data and matched the performance of a model fine-tuned on the combination of real clinical data and synthetic speech. When evaluated on held-out acoustic data from 24 individuals with various neurological diseases, the best-performing model achieved an average word error rate of 5.7% and a mean correct count accuracy of 94.4%. In segmenting the data into individual words, a mean intersection-over-union of 89.2% was obtained against manual parsing (ground truth). It can be concluded that emulated and synthetic augmentations can significantly reduce the need for real clinical data of dysarthric speech when fine-tuning ASR models and, in turn, for speech segmentation.
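The segmentation metric reported above (mean intersection-over-union against manual parsing) reduces to interval overlap per word; a minimal Python sketch with made-up word boundaries:

```python
def word_iou(pred, truth):
    """Intersection-over-union of two (onset, offset) intervals in seconds."""
    overlap = max(0.0, min(pred[1], truth[1]) - max(pred[0], truth[0]))
    union = (pred[1] - pred[0]) + (truth[1] - truth[0]) - overlap
    return overlap / union if union > 0 else 0.0

# Hypothetical ASR-aligned vs. manually parsed boundaries for one word
print(round(word_iou((0.52, 0.91), (0.50, 0.95)), 2))  # 0.87
```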


Subjects
Speech Perception, Speech, Humans, Speech Recognition Software, Dysarthria/diagnosis, Speech Disorders
12.
Anim Cogn ; 27(1): 34, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38625429

ABSTRACT

Humans have an impressive ability to comprehend signal-degraded speech; however, the extent to which comprehension of degraded speech relies on human-specific features of speech perception versus more general cognitive processes is unknown. Since dogs live alongside humans and regularly hear speech, they can be used as a model to differentiate between these possibilities. One often-studied type of degraded speech is noise-vocoded speech (sometimes thought of as cochlear-implant-simulation speech). Noise-vocoded speech is made by dividing the speech signal into frequency bands (channels), identifying the amplitude envelope of each individual band, and then using these envelopes to modulate bands of noise centered over the same frequency regions; the result is a signal with preserved temporal cues but vastly reduced frequency information. Here, we tested dogs' recognition of familiar words produced in 16-channel vocoded speech. In the first study, dogs heard their names and unfamiliar dogs' names (foils) in vocoded speech as well as natural speech. In the second study, dogs heard 16-channel vocoded speech only. Dogs listened longer to their vocoded name than to vocoded foils in both experiments, showing that they can comprehend a 16-channel vocoded version of their name without prior exposure to vocoded speech and without immediate exposure to the natural-speech version of their name. Dogs' name recognition in the second study was mediated by the number of phonemes in the dog's name, suggesting that phonological context plays a role in degraded speech comprehension.
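The abstract spells out the noise-vocoding algorithm, which makes it easy to sketch. Below is a minimal Python implementation under stated assumptions: log-spaced channel edges, 4th-order Butterworth band-pass filters, and Hilbert envelopes; the original stimuli may have used different filter shapes and envelope smoothing.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def noise_vocode(speech, fs, n_channels=16, f_lo=80.0, f_hi=7000.0):
    """Noise-vocode `speech`: band-pass filter, take each band's amplitude
    envelope, and use it to modulate noise filtered into the same band."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # assumed log spacing
    noise = np.random.default_rng(0).normal(size=speech.shape)
    out = np.zeros_like(speech, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        envelope = np.abs(hilbert(band))           # temporal cues kept
        out += envelope * sosfiltfilt(sos, noise)  # spectral detail lost
    return out

# Example: vocoded = noise_vocode(speech_samples, fs=16000)
```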


Subjects
Speech Perception, Speech, Humans, Animals, Dogs, Cues, Hearing, Linguistics
13.
J Speech Lang Hear Res ; 67(4): 1229-1242, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38563688

ABSTRACT

PURPOSE: Almost 40 years after its development, we reexamine in this article the relevance and validity of the ubiquitously used Revised Speech Perception in Noise (R-SPiN) sentence corpus. The R-SPiN corpus includes "high-context" and "low-context" sentences and has been widely used in the field of hearing research to examine the benefit derived from semantic context across English-speaking listeners, but research investigating age differences has yielded somewhat inconsistent findings. We assess the appropriateness of the corpus for use today in different English-language cultures (i.e., British and American) as well as for older and younger adults. METHOD: Two hundred forty participants, comprising older (60-80 years) and younger (19-31 years) adult groups in the United Kingdom and the United States, completed a cloze task consisting of R-SPiN sentences with the final word removed. Cloze, as a measure of predictability, and entropy, as a measure of response uncertainty, were compared between culture and age groups. RESULTS: Most critically, of the 200 "high-context" stimuli, only around half were assessed as highly predictable for older adults (United Kingdom: 109; United States: 107), and fewer still for younger adults (United Kingdom: 75; United States: 81). We also found that dominant responses to these "high-context" stimuli varied between cultures, with U.S. responses being more likely to match the original R-SPiN target. CONCLUSIONS: Our findings highlight the issue of incomplete transferability of corpus items across English-language cultures as well as diminished equivalency for older and younger adults. By identifying relevant items for each population, this work could facilitate the interpretation of inconsistent findings in the literature, particularly relating to age effects.
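Both outcome measures above (cloze probability and response entropy) can be computed directly from the distribution of completions for a sentence frame; a minimal Python sketch with invented responses:

```python
import math
from collections import Counter

def cloze_and_entropy(responses, target):
    """Cloze probability of `target` and Shannon entropy (bits) of the
    response distribution for one sentence frame."""
    counts = Counter(r.lower() for r in responses)
    n = sum(counts.values())
    cloze = counts[target.lower()] / n
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return cloze, entropy

# Hypothetical completions for one R-SPiN-style frame
responses = ["boat"] * 14 + ["ship"] * 4 + ["dock", "sail"]
print(cloze_and_entropy(responses, "boat"))  # (0.7, ~1.26 bits)
```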


Subjects
Speech Perception, Humans, Aged, Noise, Hearing/physiology, Language, Semantics
14.
Codas ; 36(3): e20230175, 2024.
Article in English | MEDLINE | ID: mdl-38629682

ABSTRACT

PURPOSE: To assess the influence of listener experience, measurement scale, and type of speech task on the auditory-perceptual evaluation of the overall severity (OS) of voice deviation and the predominant type of voice deviation (rough, breathy, or strained). METHODS: Twenty-two listeners, divided into four groups, participated in the study: speech-language pathologists specialized in voice (SLP-V), SLPs not specialized in voice (SLP-NV), graduate students with auditory-perceptual analysis training (GS-T), and graduate students without auditory-perceptual analysis training (GS-U). The listeners rated the OS of voice deviation and the predominant type of voice deviation of 44 voices using a visual analog scale (VAS) and a numerical scale (the "G" score from GRBAS), for six speech tasks: sustained vowels /a/ and /ɛ/, sentences, number counting, running speech, and all five previous tasks together. RESULTS: Sentences obtained the best interrater reliability in each group, on both the VAS and GRBAS. The SLP-NV group demonstrated the best interrater reliability in OS judgment across the different speech tasks, on both scales. Sustained vowels (/a/ and /ɛ/) and running speech obtained the best interrater reliability among the listener groups in judging the predominant vocal quality. The GS-T group achieved the best interrater reliability in judging the predominant vocal quality. CONCLUSION: Listeners' experience with auditory-perceptual judgment of voice, the type of training to which they were submitted, and the type of speech task all influence the reliability of the auditory-perceptual evaluation of vocal quality.


Subjects
Dysphonia, Speech Perception, Humans, Speech, Reproducibility of Results, Speech Production Measurement, Observer Variation, Voice Quality, Speech Acoustics
15.
BMC Res Notes ; 17(1): 107, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38632663

ABSTRACT

OBJECTIVE: Early detection and effective management of hearing loss are key to improving the quality of life of individuals with hearing loss. However, in standardized pure-tone audiometry, it is sometimes difficult for elderly patients to understand and follow all instructions. Audiologists also require time, expertise, and patience to ensure that an elderly patient can identify the faintest levels of stimuli during a hearing test. Therefore, this study aimed to devise and validate a formula to predict the pure-tone threshold at each frequency across 0.5-4 kHz (PTTs) from the speech reception threshold. METHODS: The audiograms of 1226 hearing-impaired individuals aged 60-90 years were reviewed. A random sampling function assigned 613 participants each to the training and testing sets. A linear model was created to predict the PTT value at each frequency based on variables significant at all frequencies across 0.5-4 kHz. The adjusted R² value was considered to indicate the performance of the predictive model. Pearson's correlation coefficient was used to describe the relationship between the actual and predicted PTT at 0.5, 1, 2, and 4 kHz in the testing set. RESULTS: The predictive model was devised using the speech recognition threshold (SRT) adjusted for age in the training set. The overall prediction accuracy was high, with adjusted R² ranging from 0.74 to 0.89 at 0.5, 1, and 2 kHz, whereas a low percentage of explained variance was observed at 4 kHz (adjusted R² = 0.41). This predictive model can serve as an adjunctive clinical tool for guiding determination of the PTTs. Moreover, the predicted PTTs can be applied in hearing aid programming software to set appropriate hearing aid gain using standard prescriptive formulas.
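A hedged sketch of the modeling step described (ordinary least squares predicting a PTT from SRT and age, evaluated with adjusted R²); the coefficients and data below are simulated, and the paper's actual formula is not reproduced here.

```python
import numpy as np

def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R-squared for a fitted linear model."""
    n = len(y)
    r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# Simulated training set: SRT (dB HL) and age (years) -> PTT at 1 kHz (dB HL)
rng = np.random.default_rng(1)
srt = rng.uniform(20, 80, 613)
age = rng.uniform(60, 90, 613)
ptt = 0.9 * srt + 0.2 * age + rng.normal(0, 5, 613)

X = np.column_stack([np.ones_like(srt), srt, age])
coef, *_ = np.linalg.lstsq(X, ptt, rcond=None)  # ordinary least squares
print(f"PTT_1kHz ~ {coef[0]:.1f} + {coef[1]:.2f}*SRT + {coef[2]:.2f}*age")
print(f"adjusted R2 = {adjusted_r2(ptt, X @ coef, 2):.2f}")
```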


Subjects
Hearing Loss, Speech Perception, Aged, Humans, Hearing, Quality of Life, Speech, Speech Reception Threshold Test, Middle Aged, Aged, 80 and over
16.
PLoS One ; 19(4): e0299746, 2024.
Article in English | MEDLINE | ID: mdl-38635575

ABSTRACT

In this exploratory study, we investigate the influence of several semantic-pragmatic and syntactic factors on prosodic prominence production in German, namely referential and lexical newness/givenness, grammatical role, and the position of a referential target word within a sentence. Especially in the probabilistic distribution of accent status (nuclear, prenuclear, deaccentuation), we find evidence for an additive influence of the discourse-related and syntactic cues, with lexical newness and initial sentence position showing the strongest boosting effects on a target word's prosodic prominence. The relative strength of the initial position is found in nearly all prosodic factors investigated, both discrete (such as the choice of accent type) and gradient (e.g., scaling of the Tonal Center of Gravity and intensity). Nevertheless, the differentiation of prominence relations is information-structurally less important at the beginning of an utterance than near the end: the prominence of the final object relative to the surrounding elements, especially the verbal component, is decisive for the interpretation of the sentence. Thus, it seems that a speaker adjusts locally important prominence relations (object vs. verb in sentence-final position) in addition to maintaining a more global, rhythmically determined distribution of prosodic prominences across an utterance.


Subjects
Semantics, Speech Perception, Cues, Language
17.
Trends Hear ; 28: 23312165241246616, 2024.
Article in English | MEDLINE | ID: mdl-38656770

ABSTRACT

Negativity bias is a cognitive bias that results in negative events being perceptually more salient than positive ones. For hearing care, this means that hearing aid benefits can potentially be overshadowed by adverse experiences. Research has shown that sustaining focus on positive experiences has the potential to mitigate negativity bias. The purpose of the current study was to investigate whether a positive focus (PF) intervention can improve speech-in-noise abilities for experienced hearing aid users. Thirty participants were randomly allocated to a control or PF group (N = 2 × 15). Prior to hearing aid fitting, all participants filled out the short form of the Speech, Spatial and Qualities of Hearing scale (SSQ12) based on their own hearing aids. At the first visit, they were fitted with study hearing aids, and speech-in-noise testing was performed. Both groups then wore the study hearing aids for two weeks and sent daily text messages reporting hours of hearing aid use to an experimenter. In addition, the PF group was instructed to focus on positive listening experiences and to also report them in the daily text messages. After the 2-week trial, all participants filled out the SSQ12 questionnaire based on the study hearing aids and completed the speech-in-noise testing again. Speech-in-noise performance and SSQ12 Qualities score were improved for the PF group but not for the control group. This finding indicates that the PF intervention can improve subjective and objective hearing aid benefits.


Subjects
Correction of Hearing Impairment, Hearing Aids, Noise, Persons With Hearing Impairments, Speech Intelligibility, Speech Perception, Humans, Male, Female, Aged, Noise/adverse effects, Middle Aged, Correction of Hearing Impairment/instrumentation, Persons With Hearing Impairments/rehabilitation, Persons With Hearing Impairments/psychology, Perceptual Masking, Hearing Loss/rehabilitation, Hearing Loss/psychology, Hearing Loss/diagnosis, Audiometry, Speech, Surveys and Questionnaires, Aged, 80 and over, Time Factors, Acoustic Stimulation, Hearing, Treatment Outcome
18.
Cogn Res Princ Implic ; 9(1): 25, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38652383

ABSTRACT

The use of face coverings can make communication more difficult by removing access to visual cues as well as affecting the physical transmission of speech sounds. This study aimed to assess the independent and combined contributions of visual and auditory cues to impaired communication when face coverings are used. In an online task, 150 participants rated videos of natural conversation along three dimensions: (1) how much they could follow, (2) how much effort was required, and (3) the clarity of the speech. Visual and audio variables were independently manipulated in each video, so that the same video could be presented with or without a superimposed surgical-style mask, accompanied by one of four audio conditions (unfiltered audio, or audio filtered to simulate the attenuation associated with a surgical mask, an FFP3 mask, or a visor). Hypotheses and analyses were pre-registered. Both the audio and visual variables had statistically significant negative impacts across all three dimensions. Whether or not talkers' faces were visible made the largest contribution to participants' ratings. The study identifies a degree of attenuation whose negative effects can be overcome by the restoration of visual cues. The significant effects observed in this nominally low-demand task (speech in quiet) highlight the importance of visual and audio cues in everyday life; their consideration should be included in future face mask designs.


Subjects
Cues, Speech Perception, Humans, Adult, Female, Male, Young Adult, Speech Perception/physiology, Visual Perception/physiology, Masks, Adolescent, Speech/physiology, Communication, Middle Aged, Facial Recognition/physiology
19.
Trends Hear ; 28: 23312165241246597, 2024.
Article in English | MEDLINE | ID: mdl-38629486

ABSTRACT

Hearing aids and other hearing devices should provide the user with a benefit, for example, by compensating for the effects of a hearing loss or canceling undesired sounds. However, wearing hearing devices can also have negative effects on perception, previously demonstrated mostly for spatial hearing, sound quality, and the perception of one's own voice. When hearing devices are set to transparency, that is, provide no gain and resemble open-ear listening as closely as possible, these side effects can be studied in isolation. In the present work, we conducted a series of experiments concerned with the effect of transparent hearing devices on speech perception in a collocated speech-in-noise task. In such a situation, listening through a hearing device is not expected to have any negative effect, since both speech and noise undergo identical processing, so that the signal-to-noise ratio at the ear is not altered and spatial effects are irrelevant. However, we found a consistent hearing-device disadvantage for speech intelligibility and similar trends for rated listening effort. Several hypotheses for the possible origin of this disadvantage were tested by including several different devices, gain settings, and stimulus levels. While effects of self-noise and nonlinear distortions were ruled out, the exact reason for the hearing-device disadvantage in speech perception remains unclear. However, a significant relation to auditory model predictions demonstrates that the speech intelligibility disadvantage is related to sound quality and is most probably caused by insufficient equalization, artifacts of frequency-dependent signal processing, and processing delays.


Subjects
Hearing Aids, Hearing Loss, Speech Perception, Humans, Hearing, Noise/adverse effects
20.
Hum Brain Mapp ; 45(4): e26653, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38488460

ABSTRACT

Face-to-face communication relies on the integration of acoustic speech signals with the corresponding facial articulations. In the McGurk illusion, an auditory /ba/ phoneme presented simultaneously with the facial articulation of a /ga/ (i.e., a viseme) is typically fused into an illusory 'da' percept. Despite its widespread use as an index of audiovisual speech integration, critics argue that it arises from perceptual processes that differ categorically from natural speech recognition. Conversely, Bayesian theoretical frameworks suggest that both the illusory McGurk percept and the veridical audiovisual congruent speech percept result from probabilistic inference based on noisy sensory signals. According to these models, the inter-sensory conflict in McGurk stimuli may only increase observers' perceptual uncertainty. This functional magnetic resonance imaging (fMRI) study presented participants (20 male and 24 female) with audiovisual congruent, McGurk (i.e., auditory /ba/ + visual /ga/), and incongruent (i.e., auditory /ga/ + visual /ba/) stimuli, along with their unisensory counterparts, in a syllable categorization task. Behaviorally, observers' response entropy was greater for McGurk than for congruent audiovisual stimuli. At the neural level, McGurk stimuli increased activations in a widespread neural system, extending from the inferior frontal sulci (IFS) to the pre-supplementary motor area (pre-SMA) and insulae, regions typically involved in cognitive control processes. Crucially, in line with Bayesian theories, these activation increases were fully accounted for by observers' perceptual uncertainty as measured by their response entropy. Our findings suggest that McGurk and congruent speech processing rely on shared neural mechanisms, thereby supporting the McGurk illusion as a valid measure of natural audiovisual speech perception.


Subjects
Illusions, Speech Perception, Humans, Male, Female, Auditory Perception/physiology, Speech/physiology, Illusions/physiology, Visual Perception/physiology, Bayes Theorem, Uncertainty, Speech Perception/physiology, Acoustic Stimulation/methods, Photic Stimulation/methods